Accurate Prediction of Translation Initiation Sites by Universum SVM

Authors

  • Tingting Gao
  • Yingjie Tian
  • Xiaojian Shao
  • Naiyang Deng
Abstract

To extract protein sequences from nucleotide sequences, an important step is to recognize the points at which protein-coding regions start. These points are called translation initiation sites (TIS). The task of recognizing TIS can be modeled as a classification problem. In this paper, we use a new pattern classification algorithm, recently proposed by Vapnik, to address this problem. Numerical experiments show a considerable improvement of this method over the leading existing approaches.
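As a rough illustration of how TIS recognition can be posed as a classification problem, the sketch below collects every ATG codon in a nucleotide sequence as a candidate site and encodes the surrounding window as a one-hot feature vector that any binary classifier could consume. It is not taken from the paper: the window sizes and the function names `candidate_sites` and `one_hot` are assumptions made for this example, and the abstract does not say which features or encoding the authors used.

```python
from typing import List, Tuple

NUCLEOTIDES = "ACGT"


def candidate_sites(seq: str, upstream: int = 3, downstream: int = 6) -> List[Tuple[int, str]]:
    """Return (position, window) for every ATG that has a full window around it.

    The window sizes are illustrative assumptions, not values from the paper.
    """
    seq = seq.upper()
    sites = []
    start = seq.find("ATG")
    while start != -1:
        lo = start - upstream
        hi = start + 3 + downstream
        # Keep only candidates whose window fits entirely inside the sequence.
        if lo >= 0 and hi <= len(seq):
            sites.append((start, seq[lo:hi]))
        start = seq.find("ATG", start + 1)
    return sites


def one_hot(window: str) -> List[int]:
    """Encode each base as four 0/1 indicators in A, C, G, T order.

    Bases outside ACGT (for example N) become four zeros.
    """
    vec: List[int] = []
    for base in window:
        vec.extend(1 if base == n else 0 for n in NUCLEOTIDES)
    return vec


if __name__ == "__main__":
    for pos, window in candidate_sites("CCGCCATGGCGTTTATGA"):
        print(pos, window, len(one_hot(window)))
```

Each candidate would then be labeled as a true TIS or a decoy ATG, and the resulting vectors passed to the classifier under study.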

Similar resources

Practical Analysis of the Universum SVM Learning

The idea of ‘inference through contradictions’ was introduced by Vapnik[1] in order to incorporate a priori knowledge into the learning process. This knowledge is introduced via additional unlabeled data samples (called virtual examples or the Universum) that are used along with labeled training samples, to perform an inductive inference. For example, if the goal of learning is to discriminate ...

Universum Learning for Multiclass SVM

We introduce Universum learning [1], [2] for multiclass problems and propose a novel formulation for multiclass universum SVM (MU-SVM). We also propose a span bound for MU-SVM that can be used for model selection thereby avoiding resampling. Empirical results demonstrate the effectiveness of MU-SVM and the proposed bound.

Ensemble Universum SVM Learning for Multimodal Classification of Alzheimer's Disease

Recently, machine learning methods (e.g., support vector machine (SVM)) have received increasing attention in neuroimaging-based Alzheimer’s disease (AD) classification studies. For classifying AD patients from normal controls (NC), standard SVM trains a classification model from only AD and NC subjects. However, in practice, besides AD and NC subjects, there may also exist other subjects such ...

Accurate Splice Site Detection for Caenorhabditis elegans

We propose a new system for predicting the splice form of Caenorhabditis elegans genes. As a first step we generate a clean set of genes from available expressed sequence tags (EST) and complete complementary DNA (cDNA) sequences. From all such genes we then generate potential acceptor and donor sites as they would be required by any gene finder. This leads to a clean set of true and decoy splice si...

Empirical Study of the Universum SVM Learning for High-Dimensional Data

Many applications of machine learning involve sparse high-dimensional data, where the number of input features is (much) larger than the number of data samples, d ≫ n. Predictive modeling of such data is very ill-posed and prone to overfitting. Several recent studies for modeling high-dimensional data employ a new learning methodology called Learning through Contradictions or Universum Learning du...

Journal title:

Volume   Issue 

Pages  -

Publication date: 2008